Add tutorial notebook for Two-Stage DiD (Gardner 2022) by igerber · Pull Request #159 · igerber/diff-diff

igerber · 2026-02-16T20:55:34Z

Summary

Add docs/tutorials/12_two_stage_did.ipynb — tutorial notebook for TwoStageDiD estimator (Gardner 2022)
Update CLAUDE.md — add tutorials 11 (ImputationDiD) and 12 (TwoStageDiD) to tutorials listing

Methodology references (required if estimator / math changes)

N/A — no methodology changes, documentation/tutorial only

Validation

Tests added/updated: No test changes (tutorial notebook only)
Backtest / simulation / notebook evidence: Notebook executes cleanly via jupyter execute — all 18 cells pass. Key validation: TwoStageDiD and ImputationDiD ATTs match exactly (2.420), confirming point-estimate identity.
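
For readers wondering why the two ATTs agree exactly rather than approximately, here is a minimal NumPy sketch of the point-estimate identity. It does not use the diff-diff API: the simulated panel, the variable names, and the plain unit and time fixed-effects specification are all assumptions made for illustration. Both estimators share the same first stage (fixed effects fitted on untreated observations only), and regressing the residualized outcome on a binary treatment indicator returns the mean residual among treated observations, which is the imputation estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods = 30, 8

# Hypothetical staggered design: 10 units adopt at t=4, 10 at t=6, 10 never (inf).
first_treat = np.array([4.0] * 10 + [6.0] * 10 + [np.inf] * 10)
unit = np.repeat(np.arange(n_units), n_periods)
time = np.tile(np.arange(n_periods), n_units)
D = (time >= first_treat[unit]).astype(float)

# Simulated outcome: unit FE + time FE + constant effect of 2.0 + noise.
unit_fe = rng.normal(size=n_units)
time_fe = rng.normal(size=n_periods)
y = unit_fe[unit] + time_fe[time] + 2.0 * D + rng.normal(scale=0.5, size=unit.size)

# Design matrix: all unit dummies plus time dummies for t >= 1.
unit_dummies = (unit[:, None] == np.arange(n_units)).astype(float)
time_dummies = (time[:, None] == np.arange(1, n_periods)).astype(float)
X = np.column_stack([unit_dummies, time_dummies])

# Stage 1 (shared by both estimators): fit fixed effects on untreated rows only.
untreated = D == 0
coef, *_ = np.linalg.lstsq(X[untreated], y[untreated], rcond=None)
y_tilde = y - X @ coef  # residualized outcome for every observation

# Imputation-style ATT: average of (observed - imputed untreated outcome) over treated rows.
att_imputation = y_tilde[D == 1].mean()

# Two-stage ATT: regress the residualized outcome on the treatment indicator.
beta, *_ = np.linalg.lstsq(D[:, None], y_tilde, rcond=None)
att_two_stage = beta[0]

print(att_imputation, att_two_stage)
```

The two printed values coincide up to floating-point error because, with a binary regressor and no intercept, the second-stage OLS coefficient reduces to the treated-sample mean of `y_tilde`. Standard errors are a separate matter and are not addressed by this sketch.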

Security / privacy

Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

New tutorial 12 covering TwoStageDiD estimator: basic usage, event study, per-observation treatment effects, three-estimator comparison (TwoStageDiD vs ImputationDiD vs CallawaySantAnna), group aggregation, anticipation, and GMM vs conservative variance. Also adds tutorials 11 and 12 to CLAUDE.md listing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-02-16T20:58:36Z

Overall Assessment: ⚠️ Needs changes

Executive Summary

Doc-only PR adds the TwoStageDiD tutorial notebook and updates the tutorial list.
Two methodology statements in the new notebook conflict with the Methodology Registry/implementation and should be corrected (CallawaySantAnna inference and treatment_effects weights).
No code paths changed; edge-case checklist is not applicable beyond documentation accuracy.
Tests not run (doc-only).

Methodology

Severity: P1. Impact: The notebook states CallawaySantAnna “standard errors come from an analytical multiplier bootstrap,” which misrepresents the default inference path (analytical influence-function SEs by default; multiplier bootstrap is optional). This can mislead users comparing inference across estimators. Fix: Replace with “Analytical influence-function SEs by default; optional multiplier bootstrap when n_bootstrap > 0.” Location: docs/tutorials/12_two_stage_did.ipynb:L131-L133.
Severity: P1. Impact: The notebook claims treatment_effects.weight is 1/n_treated, but the implementation assigns 1/n_valid for finite tau_hat and 0 for NaN rows, which matters in rank-deficient/NaN cases. Users doing custom aggregation could apply incorrect weights. Fix: Update text to “weight = 1/n_valid for finite tau_hat (0 for NaN)” or explicitly note NaN handling. Location: docs/tutorials/12_two_stage_did.ipynb:L100-L106 (implementation: diff_diff/two_stage.py:L778-L785).
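
To make the second finding concrete, the sketch below shows how the two weighting rules diverge once a row has a NaN effect. It is an illustration only: the DataFrame is hand-built with made-up numbers, and only the column names `tau_hat` and `weight` are taken from the review text; it is not a copy of the code in diff_diff/two_stage.py.

```python
import numpy as np
import pandas as pd

# Hypothetical per-observation effects for five treated rows; one is NaN.
te = pd.DataFrame(
    {
        "unit": [1, 1, 2, 2, 3],
        "tau_hat": [2.1, 2.6, np.nan, 1.9, 2.4],
    }
)

# Rule described in the review: 1/n_valid for finite tau_hat, 0 for NaN rows.
valid = np.isfinite(te["tau_hat"].to_numpy())
n_valid = int(valid.sum())
te["weight"] = np.where(valid, 1.0 / n_valid, 0.0)

# Custom aggregation: sum only over valid rows, since NaN * 0 is still NaN.
att = float((te.loc[valid, "weight"] * te.loc[valid, "tau_hat"]).sum())

# Rule the notebook originally stated: 1/n_treated on every treated row.
n_treated = len(te)
naive_weight_sum = float(valid.sum() / n_treated)  # weights on usable rows no longer sum to 1
att_naive = float((te.loc[valid, "tau_hat"] / n_treated).sum())

print(att, att_naive, naive_weight_sum)
```

With 1/n_valid the weights on usable rows sum to one and the aggregate equals the mean of the finite effects; with 1/n_treated the aggregate is scaled down by the share of valid rows.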

Code Quality
No findings.

Performance
No findings.

Maintainability
No findings.

Tech Debt
No findings.

Security
No findings.

Documentation/Tests

Severity: P3. Impact: The new notebook’s execution is not validated in CI, so future regressions could slip in unnoticed. Fix: Consider adding notebook execution to CI or document a local validation step (e.g., jupyter execute) in contributor docs. Location: docs/tutorials/12_two_stage_did.ipynb.

- CallawaySantAnna inference: clarify analytical influence-function SEs by default, optional multiplier bootstrap when n_bootstrap > 0
- treatment_effects.weight: correct to 1/n_valid for finite tau_hat, 0 for NaN rows (not 1/n_treated)
- Summary table: update CS variance description for consistency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
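
The difference between the two inference paths named in the first item can be sketched on a toy estimator. The example below uses the sample mean, whose influence function is simply the deviation from the mean, and compares the analytical standard error with a Rademacher multiplier bootstrap. It is a simplified stand-in under stated assumptions, not the CallawaySantAnna implementation: the data are simulated, and `n_bootstrap` here is a local variable that only borrows the parameter's name.

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(loc=1.0, scale=2.0, size=500)
n = x.size

# Influence function of the sample mean: psi_i = x_i - mean(x).
psi = x - x.mean()

# Analytical path: SE = sqrt(sum(psi_i^2)) / n, with no resampling.
se_analytical = np.sqrt(np.sum(psi**2)) / n

# Multiplier bootstrap path: perturb the influence function with
# Rademacher weights (+1 or -1 with equal probability) and take the
# standard deviation of the perturbed means across draws.
n_bootstrap = 2000
weights = rng.choice([-1.0, 1.0], size=(n_bootstrap, n))
draws = weights @ psi / n
se_bootstrap = draws.std(ddof=1)

print(se_analytical, se_bootstrap)
```

Both paths start from the same influence-function values, so they should land close to each other; the bootstrap adds simulation noise and compute cost in exchange for features such as simultaneous confidence bands.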

igerber · 2026-02-16T21:03:20Z

/ai-review

github-actions · 2026-02-16T21:05:58Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 9cdfa17a139ea12245acd52cd14b4e10ea15cd79

Overall assessment: ⚠️ Needs changes

Executive summary

Doc-only PR; no estimator or inference code changes.
Tutorial contains an accuracy issue: it says per-observation treatment_effects is unique to TwoStageDiD and that ImputationDiD lacks it, which contradicts the existing ImputationDiD API.
Update the tutorial text and summary table to reflect that ImputationDiD also exposes treatment_effects.

Methodology

No findings.

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

No findings.

Security

No findings.

Documentation/Tests

Severity: P2. Impact: The tutorial claims treatment_effects is unique to TwoStageDiD and the summary table says ImputationDiD has no per-observation effects, but ImputationDiDResults includes treatment_effects. This misleads users about available outputs. Fix: Update the “Per-Observation Treatment Effects” section to remove “unique” and adjust the summary table to show ImputationDiD = Yes (or explicitly describe differences in weights/variance). Location: docs/tutorials/12_two_stage_did.ipynb:L97-L119, docs/tutorials/12_two_stage_did.ipynb:L238-L240. Evidence: diff_diff/imputation.py:L90-L98.
Severity: P3. Impact: Tests not run (docs-only). Fix: If you want execution assurance, run the notebook via the standard doc execution flow (e.g., jupyter execute). Location: N/A.

Both TwoStageDiD and ImputationDiD provide treatment_effects DataFrame — remove incorrect "unique to TwoStageDiD" language and update summary table accordingly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

igerber · 2026-02-16T21:08:03Z

/ai-review

github-actions · 2026-02-16T21:10:40Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 113354e73eaef89a386a6d360a02955c9141bd5f

Overall Assessment
✅ Looks good

Executive Summary

Adds a new tutorial notebook at docs/tutorials/12_two_stage_did.ipynb and updates the tutorial index in CLAUDE.md.
Methodology cross‑check: TwoStageDiD two‑stage procedure, point‑estimate identity with ImputationDiD, and GMM sandwich variance statements align with docs/methodology/REGISTRY.md and diff_diff/two_stage.py.
No estimator or inference code changes; edge‑case checklist not triggered.
Tests not run (doc‑only change).

Methodology

No findings.

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

No findings.

Security

No findings.

Documentation/Tests

No findings.

Fix treatment_effects availability claim per PR review

113354e

igerber merged commit 890f414 into main Feb 16, 2026

igerber deleted the two-stage-notebook branch February 16, 2026 21:11

igerber mentioned this pull request Feb 16, 2026

Update TODO.md and ROADMAP.md for accuracy post-v2.4.0 #160

Merged

igerber added a commit that referenced this pull request Feb 17, 2026

Merge pull request #165 from igerber/tech-debt-paydown

b1e0237

Address tech debt from code reviews (PRs #115-#159)

igerber merged 3 commits into main from two-stage-notebook
